Overview

Dataset Statistics

Number of Variables 17
Number of Rows 153572
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 19.9 MB
Average Row Size in Memory 136.0 B
Variable Types
  • Categorical: 7
  • Numerical: 10
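
The two memory figures above can be cross-checked against the row count. The sketch below assumes that the report's "MB" means 2**20 bytes, which the report itself does not state; the variable names are illustrative only.

```python
# Cross-check of the memory figures in Dataset Statistics.
# Assumption (not stated in the report): "MB" is 2**20 bytes.
n_rows = 153572
avg_row_bytes = 136.0

total_bytes = n_rows * avg_row_bytes
total_mib = total_bytes / 2**20   # binary megabytes
total_mb = total_bytes / 10**6    # decimal megabytes, for comparison

print(f"{total_mib:.1f} MiB, {total_mb:.1f} MB")
```

Under the binary reading the product rounds to 19.9, which matches the reported total; the decimal reading gives 20.9 instead.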

Dataset Insights

d is skewed
f is skewed
g is skewed
l is skewed
m is skewed
score is skewed
a has constant length 1
n has constant length 1
p has constant length 1
week_of_year has constant length 1
is_weekend has constant length 1
fraude has constant length 1
f has 32550 (21.2%) zeros
h has 10502 (6.84%) zeros
m has 12130 (7.9%) zeros
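
The zero percentages in the insights follow directly from the zero counts and the row count. A minimal sketch that recomputes them (the dictionary names are illustrative, not part of the report):

```python
# Recompute the share of zeros per variable from the reported counts.
n_rows = 153572
zero_counts = {"f": 32550, "h": 10502, "m": 12130}

zero_share = {name: 100 * count / n_rows for name, count in zero_counts.items()}

for name, pct in zero_share.items():
    print(f"{name}: {pct:.2f}% zeros")
```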

Variables


a

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752
  • The largest value (3) is over 17.8 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 3
2nd row 3
3rd row 3
4th row 3
5th row 3

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • The top 2 categories (3, 1) take over 50.0%
  • a has words of constant length

b

numerical

Approximate Distinct Count 77968
Approximate Unique (%) 50.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 0.7444
Minimum 0.4863
Maximum 0.9985
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • b is skewed left (γ1 = -0.5981)

Quantile Statistics

Minimum 0.4863
5-th Percentile 0.5761
Q1 0.7016
Median 0.7564
Q3 0.7996
95-th Percentile 0.8608
Maximum 0.9985
Range 0.5122
IQR 0.09791

Descriptive Statistics

Mean 0.7444
Standard Deviation 0.08379
Variance 0.00702
Sum 114321.4723
Skewness -0.5981
Kurtosis 0.4552
Coefficient of Variation 0.1126
  • b has 5678 outliers
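
Several of the statistics listed for b are derived from others in the same block. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digits are expected (the report presumably works from unrounded data).

```python
# Values copied from the report for variable b (already rounded there).
minimum, maximum = 0.4863, 0.9985
q1, q3 = 0.7016, 0.7996
mean, std = 0.7444, 0.08379

value_range = maximum - minimum   # reported Range: 0.5122
iqr = q3 - q1                     # reported IQR: 0.09791
variance = std ** 2               # reported Variance: 0.00702
cv = std / mean                   # reported Coefficient of Variation: 0.1126

print(value_range, iqr, variance, cv)
```

The IQR recomputed from the rounded quartiles is about 0.098 rather than 0.09791, which is consistent with rounding in the displayed Q1 and Q3.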

d

numerical

Approximate Distinct Count 51
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 18.4347
Minimum 0
Maximum 50
Zeros 1613
Zeros (%) 1.1%
Negatives 0
Negatives (%) 0.0%
  • d is skewed right (γ1 = 0.71)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 3
Median 11
Q3 33
95-th Percentile 50
Maximum 50
Range 50
IQR 30

Descriptive Statistics

Mean 18.4347
Standard Deviation 17.8667
Variance 319.2195
Sum 2.8311e+06
Skewness 0.71
Kurtosis -1.0076
Coefficient of Variation 0.9692
  • d is not normally distributed (p-value 7.833428244727166e-15)
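
The negative kurtosis values in this report suggest that excess kurtosis (normal distribution = 0) is being reported. The sketch below shows the moment-based definitions of skewness (γ1) and excess kurtosis. It assumes population (biased) estimators, which the report does not confirm, and it runs on made-up toy data rather than the dataset.

```python
def moment_shape(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

sample = [1, 2, 3, 4, 10]  # toy data, not taken from the dataset
skew, kurt = moment_shape(sample)
print(skew, kurt)
```

A positive skewness, as for d, indicates a longer right tail; a negative excess kurtosis indicates tails lighter than those of a normal distribution.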

f

numerical

Approximate Distinct Count 54150
Approximate Unique (%) 35.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 9.6864
Minimum -5
Maximum 78
Zeros 32550
Zeros (%) 21.2%
Negatives 1341
Negatives (%) 0.9%
  • f is skewed right (γ1 = 2.2183)

Quantile Statistics

Minimum -5
5-th Percentile 0
Q1 0.3467
Median 3
Q3 12
95-th Percentile 45
Maximum 78
Range 83
IQR 11.6533

Descriptive Statistics

Mean 9.6864
Standard Deviation 14.9789
Variance 224.3689
Sum 1.4876e+06
Skewness 2.2183
Kurtosis 4.7959
Coefficient of Variation 1.5464
  • f is not normally distributed (p-value 4.753349207911276e-22)
  • f has 15942 outliers

g

numerical

Approximate Distinct Count 41
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 6.4539
Minimum -1
Maximum 50
Zeros 0
Zeros (%) 0.0%
Negatives 109
Negatives (%) 0.1%
  • g is skewed right (γ1 = 6.6238)

Quantile Statistics

Minimum -1
5-th Percentile 2
Q1 6
Median 6
Q3 6
95-th Percentile 6
Maximum 50
Range 51
IQR 0

Descriptive Statistics

Mean 6.4539
Standard Deviation 5.8479
Variance 34.1978
Sum 991134
Skewness 6.6238
Kurtosis 44.3378
Coefficient of Variation 0.9061
  • g is not normally distributed (p-value 6.851713659541343e-25)
  • g has 16272 outliers
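
The report does not say which rule produces the outlier counts. A common choice is Tukey's box-plot rule, which flags values more than 1.5 × IQR beyond the quartiles. The sketch below applies that rule, as an assumption, to the quartiles reported for f and g; `tukey_fences` is a hypothetical helper, not a function from the reporting tool.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper outlier fences under Tukey's rule (assumed here)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles copied from the report.
f_lower, f_upper = tukey_fences(0.3467, 12)
g_lower, g_upper = tukey_fences(6, 6)

print(f_lower, f_upper)
print(g_lower, g_upper)
```

For g the IQR is 0, so both fences collapse onto 6 and every value other than 6 would be flagged under this rule. For f the lower fence falls below the minimum of -5, so any flagged values would lie in the right tail.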

h

numerical

Approximate Distinct Count 49
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 12.3658
Minimum 0
Maximum 48
Zeros 10502
Zeros (%) 6.8%
Negatives 0
Negatives (%) 0.0%
  • h is skewed right (γ1 = 1.1282)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 4
Median 9
Q3 18
95-th Percentile 36
Maximum 48
Range 48
IQR 14

Descriptive Statistics

Mean 12.3658
Standard Deviation 11.0125
Variance 121.2763
Sum 1.899e+06
Skewness 1.1282
Kurtosis 0.7381
Coefficient of Variation 0.8906
  • h is not normally distributed (p-value 0.009759599514139345)
  • h has 4842 outliers

k

numerical

Approximate Distinct Count 153572
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 0.4997
Minimum 4.18e-06
Maximum 1
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • k is nearly symmetric, with negligible left skew (γ1 = -0.0081)

Quantile Statistics

Minimum 4.18e-06
5-th Percentile 0.074
Q1 0.2834
Median 0.5017
Q3 0.7136
95-th Percentile 0.9231
Maximum 1
Range 1
IQR 0.4303

Descriptive Statistics

Mean 0.4997
Standard Deviation 0.2637
Variance 0.06952
Sum 76732.9757
Skewness -0.008111
Kurtosis -1.0373
Coefficient of Variation 0.5277

l

numerical

Approximate Distinct Count 78624
Approximate Unique (%) 51.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 1585.8479
Minimum 0
Maximum 6390
Zeros 1759
Zeros (%) 1.1%
Negatives 0
Negatives (%) 0.0%
  • l is skewed right (γ1 = 1.0494)

Quantile Statistics

Minimum 0
5-th Percentile 32.4433
Q1 418.1301
Median 1187.3495
Q3 2397
95-th Percentile 4602
Maximum 6390
Range 6390
IQR 1978.8699

Descriptive Statistics

Mean 1585.8479
Standard Deviation 1431.8627
Variance 2.0502e+06
Sum 2.4354e+08
Skewness 1.0494
Kurtosis 0.3858
Coefficient of Variation 0.9029
  • l is not normally distributed (p-value 2.0091315153107663e-09)
  • l has 2746 outliers

m

numerical

Approximate Distinct Count 70957
Approximate Unique (%) 46.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 210.6646
Minimum 0
Maximum 1040
Zeros 12130
Zeros (%) 7.9%
Negatives 0
Negatives (%) 0.0%
  • m is skewed right (γ1 = 1.3676)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 29.3672
Median 126.665
Q3 319.5595
95-th Percentile 721
Maximum 1040
Range 1040
IQR 290.1923

Descriptive Statistics

Mean 210.6646
Standard Deviation 230.7373
Variance 53239.6969
Sum 3.2352e+07
Skewness 1.3676
Kurtosis 1.2766
Coefficient of Variation 1.0953
  • m is not normally distributed (p-value 5.566042320496766e-19)
  • m has 6464 outliers

n

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752
  • The largest value (1) is over 4.31 times larger than the second largest value (0)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • The top 2 categories (1, 0) take over 50.0%
  • n has words of constant length

o

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10212795

Length

Mean 1.5017
Standard Deviation 0.5
Median 2
Minimum 1
Maximum 2

Sample

1st row -1
2nd row 1
3rd row 0
4th row 1
5th row -1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 77043
Decimal Number 153572
  • The top 2 categories (-1, 0) take over 50.0%
  • The largest value (-1) is over 2.0 times larger than the second largest value (0)
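
The length statistics for o can be reproduced from the character counts. The sketch assumes that every value containing a dash is the two-character string "-1" and that all other values have length 1, which is consistent with the sample rows and the reported minimum and maximum lengths.

```python
import math

n_rows = 153572
n_with_dash = 77043                     # assumed: values equal to "-1", length 2
n_without_dash = n_rows - n_with_dash   # assumed: values "0" or "1", length 1

mean_len = (2 * n_with_dash + 1 * n_without_dash) / n_rows
p = n_with_dash / n_rows                # share of length-2 values
std_len = math.sqrt(p * (1 - p))        # std of a two-point distribution on {1, 2}

print(f"share of -1: {p:.4f}, mean length: {mean_len:.4f}, std: {std_len:.4f}")
```

The share of rows containing a dash is just over 50%, so -1 is the most frequent category, and the recomputed mean and standard deviation round to the reported 1.5017 and 0.5.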

p

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752
  • The largest value (0) is over 1.92 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 0
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • The top 2 categories (0, 1) take over 50.0%
  • p has words of constant length

monto

numerical

Approximate Distinct Count 80320
Approximate Unique (%) 52.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 24.0385
Minimum 0.02
Maximum 79.51
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • monto is skewed right (γ1 = 1.0154)

Quantile Statistics

Minimum 0.02
5-th Percentile 5.3
Q1 11.1527
Median 20.4527
Q3 32.82
95-th Percentile 57.2335
Maximum 79.51
Range 79.49
IQR 21.6673

Descriptive Statistics

Mean 24.0385
Standard Deviation 16.153
Variance 260.918
Sum 3.6916e+06
Skewness 1.0154
Kurtosis 0.5719
Coefficient of Variation 0.672
  • monto has 3609 outliers

score

numerical

Approximate Distinct Count 101
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2457152
Mean 58.345
Minimum 0
Maximum 100
Zeros 4101
Zeros (%) 2.7%
Negatives 0
Negatives (%) 0.0%
  • score is skewed left (γ1 = -0.3911)

Quantile Statistics

Minimum 0
5-th Percentile 4
Q1 27
Median 67
Q3 88
95-th Percentile 99
Maximum 100
Range 100
IQR 61

Descriptive Statistics

Mean 58.345
Standard Deviation 32.0836
Variance 1029.3574
Sum 8.9602e+06
Skewness -0.3911
Kurtosis -1.28
Coefficient of Variation 0.5499
  • score is not normally distributed (p-value 2.9271601805310722e-08)

week_of_year

categorical

Approximate Distinct Count 8
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 2
2nd row 1
3rd row 1
4th row 3
5th row 4

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • week_of_year has words of constant length

is_weekend

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752
  • The largest value (0) is over 4.47 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 1
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • The top 2 categories (0, 1) take over 50.0%
  • is_weekend has words of constant length
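
For variables with exactly two categories, the "times larger" ratio in the insights can be converted into an approximate class share. The sketch below does this for n, p, and is_weekend. Because the report states the ratios as "over" a value, the resulting shares are approximate; `majority_share` is an illustrative helper name.

```python
def majority_share(ratio):
    """Share of the larger class when it is `ratio` times the smaller one.

    Valid only for variables with exactly two categories.
    """
    return ratio / (1 + ratio)

# Ratios copied from the report's insights for two-category variables.
reported_ratios = {"n": 4.31, "p": 1.92, "is_weekend": 4.47}
shares = {name: majority_share(r) for name, r in reported_ratios.items()}

for name, share in shares.items():
    print(f"{name}: majority class is about {share:.1%}")
```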

fraude

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 10135752

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 153572
  • The top 2 categories (0, 1) take over 50.0%
  • fraude has words of constant length

Interactions

Correlations

Missing Values